Many of the ecological and environmental processes of interest can be represented by a spatial point process or can be viewed as an aggregation of one.
Consider a fixed geographical region \(A\).
The set of locations at which events occur is denoted by \(\mathbf{s} = (\mathbf{s}_1, \ldots, \mathbf{s}_n)\).
We let \(N(A)\) be a random variable representing the total number of events in region \(A\); more generally, \(N(A_i)\) counts the events in any subset \(A_i \subseteq A\).
Our primary interest is in measuring where events occur, so the locations are our data.
There are three broad types of spatial point patterns which can be explored, each representing a different type of spatial dependence.
Complete spatial randomness (CSR) - events occur at random, and independently of each other.
Clustered process - events occur close to existing events.
Regular process - events occur away from existing events.
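These three regimes can be illustrated with a quick simulation. Below is a minimal base-R sketch (no spatial packages; the parent-offspring and minimum-distance constructions are our own toy choices, not formal cluster or Gibbs models):

```r
# Toy illustrations of the three pattern types on the unit square.
set.seed(42)

# CSR: locations uniform and independent
csr <- data.frame(x = runif(100), y = runif(100))

# Clustered: offspring scattered tightly around a few parent points
# (offspring may fall slightly outside the window; fine for illustration)
parents <- data.frame(x = runif(10), y = runif(10))
idx <- sample(10, 100, replace = TRUE)
clustered <- data.frame(x = rnorm(100, parents$x[idx], 0.02),
                        y = rnorm(100, parents$y[idx], 0.02))

# Regular: accept a proposed point only if it keeps a minimum distance
regular <- data.frame(x = runif(1), y = runif(1))
while (nrow(regular) < 100) {
  p <- c(runif(1), runif(1))
  d <- sqrt((regular$x - p[1])^2 + (regular$y - p[2])^2)
  if (min(d) > 0.05) regular <- rbind(regular, data.frame(x = p[1], y = p[2]))
}
```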
Locations of isopod burrows in the northern Negev Desert, Israel.
If we look at the complete data set, we can see a strong aggregation pattern.
Isopod burrow data with \(n=2015\) individual burrows in a study area with an extent of 75,600 m\(^2\). Inset maps show two subsets with reduced extent of 6.4 m\(^2\) each. Source: (Dungan et al. 2002).
We can define the (first order) intensity of a point process as the expected number of events per unit area.
This can also be thought of as a measure of the density of our points.
In some cases, the intensity will be constant over space (homogeneous), while in other cases it can vary by location (inhomogeneous or heterogeneous).
If our intensity is homogeneous, we can define it as
\[ \lambda(\mathbf{s}) = \frac{\mathbb{E}[N(A)]}{|A|} = \frac{\lambda |A|}{|A|} = \lambda. \]
We can use the concept of intensity to help us define complete spatial randomness (CSR).
Uniformity and independent scattering: Given the number of events \(N(A) = n\) in a region, the \(n\) events are independently and uniformly distributed over space (i.e., each event has an equal probability of occurring anywhere in the study area).
Poisson distribution of point counts: The number of points in any set \(A_i\) follows a Poisson distribution with mean \(\lambda|A_i|\), that is \[N(A_i) \sim \text{Poisson}(\lambda|A_i|).\]
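Both properties can be checked empirically with a short simulation. The sketch below (the helper name `simulate_hpp` is ours) first draws \(N(A) \sim \text{Poisson}(\lambda|A|)\) and then scatters that many points uniformly:

```r
# Sketch: simulate a homogeneous Poisson process on the unit square.
# Step 1: draw the total count N(A) ~ Poisson(lambda * |A|).
# Step 2: given N(A) = n, scatter the n points independently and uniformly.
simulate_hpp <- function(lambda, xmax = 1, ymax = 1) {
  n <- rpois(1, lambda * xmax * ymax)
  data.frame(x = runif(n, 0, xmax), y = runif(n, 0, ymax))
}

set.seed(1)
pts <- simulate_hpp(lambda = 100)
nrow(pts)  # one realisation of N(A); its expectation is lambda * |A| = 100
```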
The simplest point process model is the homogeneous Poisson process (HPP).
The likelihood of a point pattern \(\mathbf{y} = \left[ \mathbf{s}_1, \ldots, \mathbf{s}_n \right]^\intercal\) distributed as a HPP with intensity \(\lambda\) and observation window \(\Omega\) is
\[ p(\mathbf{y} | \lambda) \propto \lambda^n e^{ - |\Omega| \lambda } , \]
where
\(|\Omega|\) is the size of the observation window,
\(\lambda\) is the expected number of points per unit area,
\(|\Omega|\lambda\) is the total expected number of points in the observation window.
We can contrast the observed point pattern against a point pattern generated from this CSR model to determine whether a HPP is appropriate for our data.
To do this, we calculate what is known as Ripley’s K-function.
Ripley’s K is a function of distance \(r\), and is given by
\[ K(r) = \frac{E[N(\mathbf{s}_0, r)]}{\lambda} \]
\(N(\mathbf{s}_0, r)\) denotes the number of events that occur within distance \(r\) of an event \(\mathbf{s}_0\).
Clearly, as \(r\) increases, so too will \(K(r)\).
Ripley’s \(K\) can be estimated as:
\[ \hat{K}(r) = \color{tomato}{\frac{1}{n}\sum_{i=1}^n\sum_{j\neq i}I(\|\mathbf{s}_i-\mathbf{s}_j\|<r)} \times \color{purple}{\hat{\lambda}^{-1}} \]
If we have a homogeneous Poisson process, we would expect that
The expected number of points in any area is \(\lambda \times |A|\)
Therefore: \(E[\text{points within } r] = \lambda \times \pi r^2\)
\[ K_{CSR}(r) = \lambda \times (\pi r^2) \times \lambda^{-1} = \pi \times r^2 \]
That is, under CSR we would expect that the K function is equal to the area of the circle with radius \(r\).
When working with real data, some natural variation is to be expected even when CSR holds.
We therefore need an approach which accounts for this when assessing for CSR.
We can estimate \(\hat{K}(r)\) across a set of distances \(r\) for our set of observed events.
Our \(\hat{K}(r)\) can then be compared to the theoretical function under CSR, \(K_{CSR}(r) = \pi r^2\).
If the two functions are similar, then CSR is reasonable.
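As a toy illustration, here is a naive base-R estimator of \(\hat{K}(r)\) with no edge correction (the function name `khat` is ours; in practice one would use an edge-corrected estimator such as spatstat's `Kest`):

```r
# A naive estimator of Ripley's K in base R (no edge correction).
# area = |Omega|, lambda_hat = n / area; d > 0 drops the self-pairs.
khat <- function(x, y, r, area = 1) {
  n <- length(x)
  d <- as.matrix(dist(cbind(x, y)))   # matrix of pairwise distances
  lambda_hat <- n / area
  sapply(r, function(rr) sum(d > 0 & d < rr) / n / lambda_hat)
}

# Three collinear points 0.1 apart: at r = 0.15 exactly four ordered
# pairs fall within distance r, so K-hat = 4 / (n * lambda_hat) = 4/9.
khat(x = c(0, 0, 0), y = c(0, 0.1, 0.2), r = 0.15)  # 0.444...
```

Under CSR this estimate would track \(\pi r^2\); values well above or below it point to clustering or regularity, as described next.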
If \(\hat{K}(r) > K_{CSR}(r)\), more points are found within a radius \(r\) than would be expected under CSR, suggesting a clustering pattern.
If \(\hat{K}(r) \approx K_{CSR}(r)\), the observed pattern is consistent with a realization of an HPP (CSR).
If \(\hat{K}(r) < K_{CSR}(r)\), the pattern is more regular, since we observe fewer neighboring points within a distance \(r\) than expected under CSR.
While Ripley’s \(K\) function is widely used in environmental and ecological studies, it has some caveats.
\(K(r)\) is a cumulative function: all points at distances less than \(r\) also contribute, so estimates at different distances are not independent.
Edge effects: points near the boundaries of the study area have fewer observed neighboring points within distance \(r\), leading to underestimation of \(K(r)\).
So far we have assumed that the point process is stationary and isotropic. The inhomogeneous Poisson process (IPP) relaxes this assumption: it has a spatially varying intensity \(\lambda(\mathbf{s})\), defined in terms of spatially varying covariates that are available across the whole study area:
\[ \lambda(s) = \mathrm{exp}(\alpha+\beta x(s) +\ldots ) \]
Let \(\mathbf{y} = (\mathbf{s}_1,\ldots,\mathbf{s}_n)\) denote the \(n\) observed events/points in an observation window \(\Omega\).
For an IPP with an intensity \(\lambda(s)\), the likelihood is given by:
\[ p(\mathbf{y} | \lambda) \propto \exp \left( -\int_\Omega \lambda(\mathbf{s}) \mathrm{d}\mathbf{s} \right) \prod_{i=1}^n \lambda(\mathbf{s}_i). \]
In the case of an HPP, the integral in the likelihood can easily be computed as \(\int_\Omega \lambda(\mathbf{s}) \mathrm{d}\mathbf{s} =|\Omega|\lambda\).
For an HPP with an intensity \(\lambda\), the log-likelihood is given by: \[ l(\lambda;\mathbf{y}) = n\log(\lambda) -\lambda|\Omega|. \]
The maximum likelihood estimator is \(\hat{\lambda} = n/|\Omega|\).
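A quick numerical check that the log-likelihood peaks at \(\hat{\lambda} = n/|\Omega|\), with arbitrary toy numbers:

```r
# The HPP log-likelihood l(lambda) = n*log(lambda) - lambda*|Omega|
# should be maximised at lambda_hat = n / |Omega|. Toy numbers below.
n <- 50; omega <- 10
loglik <- function(lambda) n * log(lambda) - lambda * omega
opt <- optimize(loglik, interval = c(0.01, 20), maximum = TRUE)
opt$maximum   # close to n / omega = 5
```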
For an IPP, the integral in the likelihood has to be approximated as a weighted sum.
This integral is approximated as \(\int_\Omega \lambda(\mathbf{s}) \mathrm{d}\mathbf{s} \approx \sum_{j=1}^J w_j \lambda(\mathbf{s}_j)\)
\(w_j\) are the integration weights
\(\mathbf{s}_j\) are the quadrature locations.
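As a 1-D sketch of such a weighted sum, here is a midpoint rule approximating \(\int_0^1 \exp(\alpha + \beta s)\,\mathrm{d}s\) against its analytic value (the equal weights \(w_j = 1/J\) are the simplest possible choice; inlabru uses more refined schemes):

```r
# Midpoint rule with J equally weighted quadrature points on [0, 1]:
# integral of lambda(s) = exp(alpha + beta*s) approximated as sum(w_j * lambda(s_j)).
alpha <- 1; beta <- 2; J <- 100
s_j <- (seq_len(J) - 0.5) / J    # quadrature locations (interval midpoints)
w_j <- rep(1 / J, J)             # integration weights, summing to |Omega| = 1
approx_int <- sum(w_j * exp(alpha + beta * s_j))
exact_int <- (exp(alpha + beta) - exp(alpha)) / beta  # analytic value
c(approx = approx_int, exact = exact_int)
```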
This serves two purposes:
Approximating the integral
re-writing the inhomogeneous Poisson process likelihood as a regular Poisson likelihood.
The idea behind the trick of rewriting the approximate likelihood is to introduce a dummy vector \(\mathbf{z}\) and an integration-weights vector \(\mathbf{w}\), both of length \(J + n\):
\[\mathbf{z} = \left[\underbrace{0_1, \ldots,0_J}_\text{quadrature locations}, \underbrace{1_1, \ldots ,1_n}_{\text{data points}} \right]^\intercal\]
\[\mathbf{w} = \left[ \underbrace{w_1, \ldots, w_J}_\text{quadrature locations}, \underbrace{0_1, \ldots, 0_n}_\text{data points} \right]^\intercal\]
Then the approximate likelihood can be written as
\[ \begin{aligned} p(\mathbf{z} | \lambda) &\propto \prod_{i=1}^{J + n} \lambda_i^{z_i} \exp\left(-w_i \lambda_i \right), \\ \log \lambda_i &= \log\lambda(\mathbf{s}_i) = \mathbf{x}(\mathbf{s}_i)'\beta \end{aligned} \]
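For an intercept-only model, the maximiser of this approximate likelihood has the closed form \(\sum_i z_i / \sum_i w_i = n/|\Omega|\), which we can verify numerically (a base-R sketch; no INLA required):

```r
# Intercept-only sketch of the dummy-vector trick: with log(lambda_i) = beta0
# for all i, the approximate likelihood prod_i lambda_i^{z_i} exp(-w_i lambda_i)
# is maximised at exp(beta0) = sum(z) / sum(w) = n / |Omega|.
n <- 40; J <- 200; omega <- 1
z <- c(rep(0, J), rep(1, n))          # 0 at quadrature locations, 1 at data points
w <- c(rep(omega / J, J), rep(0, n))  # weights at quadrature locations, 0 at data
negll <- function(beta0) -(sum(z) * beta0 - sum(w) * exp(beta0))
beta0_hat <- optimize(negll, interval = c(-5, 10))$minimum
exp(beta0_hat)   # close to n / omega = 40
```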
Adding a spatially structured (Gaussian) random effect \(\xi(s)\) to the log-intensity gives a log-Gaussian Cox process (LGCP):
\[ \log~\lambda(s)= \mathbf{x}(s)'\beta + \xi(s) \]
How do we model \(\xi(s)\)?
inlabru has implemented some integration schemes that are especially well suited to integrating the intensity in models with an SPDE effect. In this example, we model the locations of forest fires in the Castilla-La Mancha region of Spain between 1998 and 2007.
We are now going to use the elevation as a covariate to explain the variability of the intensity \(\lambda(s)\) over the domain of interest and a spatially structured SPDE model.
\[ \log\lambda(s) = \beta_0 + \beta_1 \text{elevation}(s) + \xi(s) \]
The IPP Model \[ \begin{aligned} p(\mathbf{y} | \lambda) & \propto \exp \left( -\int_\Omega \lambda(\mathbf{s}) \mathrm{d}\mathbf{s} \right) \prod_{i=1}^n \lambda(\mathbf{s}_i) \\ \eta(s) & = \log ~\lambda(s) = \beta_0 + x(s) \end{aligned} \] The code
The LGCP Model
\[ \begin{aligned} p(\mathbf{y} | \lambda) & \propto \exp \left( -\int_\Omega \lambda(\mathbf{s}) \mathrm{d}\mathbf{s} \right) \prod_{i=1}^n \lambda(\mathbf{s}_i) \\ \eta(s) & = \log(\lambda(s)) = \color{#FF6B6B}{\boxed{\beta_0}} + \color{#FF6B6B}{\boxed{ x(s)}} + \color{#FF6B6B}{\boxed{ \omega(s)}}\\ \end{aligned} \]
The code
# define model component
cmp = ~ Intercept(1) + elev(elev_raster, model = "linear") +
space(geometry, model = spde_model)
# define model predictor
eta = geometry ~ Intercept + elev + space
# build the observation model
lik = bru_obs("cp",
formula = eta,
data = pp,
ips = ips)
# fit the model
fit = bru(cmp, lik)
Model Predictions
We do not always observe all the events. Especially in ecology!
Distance sampling is a family of related methods for estimating the abundance and spatial distribution of wild populations.
Distance sampling is based on the idea that animals further away from observers are harder to detect than animals that are nearer.
This idea is implemented in the model as a detection function that depends on distance.
Coupling distance sampling data with spatial modelling allows maps of spatially varying density to be produced.
Traditionally, this is achieved in a two-stage approach by (i) using detectability point estimates to create an offset vector to (ii) use within a GLM or GAM for count response data.
This requires binning the data into counts based on some discretisation of space.
A major downside to this approach is that uncertainty from the detection model is not propagated to the second-stage spatial model.
The LGCP is a flexible approach that can include spatial covariates to model the mean intensity and a mean-zero spatially structured random effect to account for unexplained heterogeneity not captured by the covariates.
To account for the imperfect detection of points we specify a thinning probability function \[g(s) = \mathbb{P}(\text{a point at s is detected}|\text{a point is at s})\]
A key property of the LGCP is that a realisation of a point process with intensity \(\lambda(s)\), thinned by a probability function \(g(s)\), also follows an LGCP with intensity:
\[ \underbrace{\tilde{\lambda}(s)}_{\text{observed process}} = \underbrace{\lambda(s)}_{\text{true process}} \times \underbrace{g(s)}_{\text{thinning probability}} \]
Let's visualize this in 1D:
Intensity function with points.
Intensity (density) function with points and transect locations.
Detection function \(\color{red}{g(s)}\); here \(\color{red}{g(s) = 1}\) on the transects (at x = 10, 30 and 50).
The detection function describes the probability \(\color{red}{p(s)}\) that a point is detected.
Observations are from a thinned Poisson process with intensity \(\lambda(s) \color{red}{p(s)}\).
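A base-R sketch of this thinning in 1D (the intensity, half-normal width, and transect positions are our toy choices):

```r
# Simulate an HPP with intensity lambda = 5 on [0, 60], then thin it:
# each point is kept with probability g(s), a half-normal (sigma = 2, a
# toy choice) in the distance to the nearest transect at x = 10, 30, 50.
set.seed(7)
s <- runif(rpois(1, 5 * 60), 0, 60)
g <- function(s) {
  d <- pmin(abs(s - 10), abs(s - 30), abs(s - 50))  # distance to nearest transect
  exp(-0.5 * (d / 2)^2)
}
observed <- s[runif(length(s)) < g(s)]
length(observed) / length(s)   # proportion of points that were detected
```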
Half-normal: \(g(\mathbf{s}|\sigma) = \exp(-0.5 (d(\mathbf{s})/\sigma)^2)\)
Hazard-rate: \(g(\mathbf{s}|\sigma) = 1 - \exp(-(d(\mathbf{s})/\sigma)^{-1})\)
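Written out as R functions (a sketch; the hazard-rate shape parameter is fixed at 1, matching the formula above):

```r
# The two detection functions as R code.
half_normal <- function(d, sigma) exp(-0.5 * (d / sigma)^2)
hazard_rate <- function(d, sigma) 1 - exp(-(d / sigma)^(-1))

half_normal(0, 1)   # 1: detection is certain at distance 0
hazard_rate(1, 1)   # 1 - exp(-1), about 0.63 at d = sigma
```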
\[ \pi(\mathbf{s_1},\ldots,\mathbf{s_m}) = \exp\left( |\Omega| - \int_{\mathbf{s}\in\Omega}\lambda(s)g(s)\text{d}s \right) \prod_{i=1}^m \lambda(\mathbf{s}_i)g(\mathbf{s}_i) \]
To make \(g(s)\) and \(\lambda(s)\) identifiable, we assume intensity is constant with respect to distance from the observer.
If the strip width ( \(2W\) ) is narrow compared to the study region (\(\Omega\)), we can treat the strips as lines.
Define the Poisson process likelihood on the Kronecker product space (line \(\times\) distance).
Accounting for imperfect detection the thinned Poisson process model on (space, distance) along the transects becomes:
\[ \begin{aligned} \log \tilde{\lambda}(s,\text{distance}) &= \overbrace{\mathbf{x}'\beta + \xi(s)}^{\log \lambda(s)} + \log \mathbb{P}(\text{detection at }s|\text{distance},\sigma) + \log(2)\\ \mathbb{P}(\text{detection}) &=1-\exp\left(-\frac{\sigma}{\text{distance}}\right) \end{aligned} \]
Here \(\log 2\) accounts for the two-sided detection.
Typically \(\mathbb{P}(\text{detection}\mid\text{distance})\) is a non-linear function; this is where inlabru can help, via a fixed-point iteration scheme (further details available in this vignette).
We define \(\log(\sigma)\) as a latent Gaussian variable and iteratively linearise it.
In the next example, we will explore data from a combination of several NOAA shipboard surveys conducted on pan-tropical spotted dolphins in the Gulf of Mexico.
A total of 47 observations of groups of dolphins were detected. The group size was recorded, as well as the Beaufort sea state at the time of the observation.
Transect width is 16 km, i.e. maximal detection distance 8 km (transect half-width 8 km).
First, we need to create the mesh used to approximate the random field. We can:
use the sf boundary and pass it directly to the mesh construction via the fm_mesh_2d function
set max.edge to control the maximum triangle edge lengths
set cutoff to avoid overly small triangles in clustered areas
We use inla.spde2.pcmatern to define the SPDE model, with PC priors specified through the following probability statements:
\(P(\rho < 50) = 0.1\)
\(P(\sigma > 2) = 0.1\)
We start by plotting the distances and histogram of frequencies in distance intervals.
Then, we need to define a half-normal detection probability function. This must take distance as its first argument and the linear predictor of the sigma parameter as its second:
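A minimal sketch of such a function, assuming the second argument is the linear predictor of \(\log\sigma\) (the name `hn` matches its use in the predictor `eta` in the code below; check the parameterisation against your own setup):

```r
# Sketch of a half-normal detection function: first argument the distance,
# second the linear predictor of log(sigma), hence exp(lsig) below.
hn <- function(distance, lsig) {
  exp(-0.5 * (distance / exp(lsig))^2)
}
hn(0, 0)   # 1: certain detection at distance zero
```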
The LGCP Model
\[ \begin{aligned} p(\mathbf{y} | \lambda) & \propto \exp \left( -\int_\Omega \lambda(\mathbf{s}) p(\mathbf{s}) \mathrm{d}\mathbf{s} \right) \prod_{i=1}^n \lambda(\mathbf{s}_i) p(\mathbf{s}_i) \\ \eta(s) & = \color{#FF6B6B}{\boxed{\beta_0}} + \color{#FF6B6B}{\boxed{ \omega(s)}} + \color{#FF6B6B}{\boxed{ \log p(s)}} \\ \end{aligned} \]
The code
# define model component
cmp = ~ Intercept(1) +
space(main = geometry, model = spde_model) +
sigma(1,
prec.linear = 1,
marginal = bru_mapper_marginal(qexp, pexp, dexp, rate = 1 / 8)
)
# define model predictor
eta = geometry + distance ~ space +
log(hn(distance, sigma)) +
Intercept + log(2)
# build the observation model
lik = bru_obs("cp",
formula = eta,
data = mexdolphin$points,
ips = ips)
# fit the model
fit = bru(cmp, lik)
We can now obtain posterior summaries of the model parameters.
| | mean | 0.025quant | 0.975quant |
|---|---|---|---|
| Intercept | −8.41 | −9.47 | −7.62 |
| sigma | −0.05 | −0.46 | 0.36 |
| Range for space | 131.74 | 41.79 | 320.28 |
| Stdev for space | 1.17 | 0.72 | 1.78 |
To map the spatial intensity we first need to define a grid of points where we want to predict.
fm_pixels(), which creates a regular grid of points covering the mesh
the predict function, which takes as input the fitted model (fit) and the prediction locations (pxl)
ggplot, adding a gg() layer with your output of interest (e.g., pr.int$spatial)
We can also use the predict function to predict the detection probabilities:
47 groups were seen. How many would be seen along the transects under perfect detection?
| mean | sd | q0.025 | q0.5 | q0.975 | median | sd.mc_std_err | mean.mc_std_err |
|---|---|---|---|---|---|---|---|
| 96.15 | 28.65 | 57.27 | 88.26 | 160.56 | 88.26 | 2.24 | 3.31 |
How many would be seen under perfect detection across the whole study area (i.e., the mean expected number of dolphins)?
| mean | sd | q0.025 | q0.5 | q0.975 | median | sd.mc_std_err | mean.mc_std_err |
|---|---|---|---|---|---|---|---|
| 325.12 | 95.72 | 199.67 | 302.79 | 557.35 | 302.79 | 8.53 | 11.28 |
What’s the predictive distribution of group counts?
We can also get Monte Carlo samples for the expected number of dolphins as follows:
Ns <- seq(50, 450, by = 1)
Nest <- predict(fit, predpts,
~ data.frame(
N = Ns,
density = dpois(
Ns,
lambda = sum(weight * exp(space + Intercept))
)
),
n.samples = 2000
)
Nest <- dplyr::bind_rows(
cbind(Nest, Method = "Posterior"),
data.frame(
N = Nest$N,
mean = dpois(Nest$N, lambda = Lambda$mean),
mean.mc_std_err = 0,
Method = "Plugin"
)
)
Point processes are stochastic processes that describe the locations where events occur.
Unlike geostatistical data, where the locations are fixed, here the locations themselves are stochastic: the locations are our data!
CSR can be viewed as a realisation of an HPP that describes events occurring independently and uniformly at random across space, such that the number of events in any region follows a Poisson distribution with mean \(\lambda \times \text{area}\).
K functions can be used to distinguish between CSR, spatial clustering or regular point patterns.
IPP allows the intensity of the point process to vary across space through spatially varying covariates.
Numerical integration schemes are required to estimate the parameters of an IPP.
LGCPs are doubly stochastic processes that extend IPP models by allowing the intensity function to vary spatially according to a structured spatial random effect.
Thinned point processes offer improved accuracy by accounting for the observational process of how individuals are detected.
“It is that range of biodiversity that we must care for — the whole thing — rather than just one or two stars”
Sir David Attenborough